Networked Restless Bandits with Positive Externalities

Abstract

Restless multi-armed bandits are often used to model budget-constrained resource allocation tasks where receipt of the resource is associated with an increased probability of a favorable state transition. Prior work assumes that individual arms only benefit if they receive the resource directly. However, many allocation tasks occur within communities and can be characterized by positive externalities that allow arms to derive partial benefit when their neighbor(s) receive the resource. We thus introduce networked restless bandits, a novel multi-armed bandit setting in which arms are both restless and embedded within a directed graph. We then present Greta, a graph-aware, Whittle index-based heuristic algorithm that can be used to efficiently construct a constrained reward-maximizing action vector at each timestep. Our empirical results demonstrate that Greta outperforms comparison policies across a range of hyperparameter values and graph topologies. Code and appendices are available at https://github.com/crherlihy/networked_restless_bandits.
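
To make the setting in the abstract concrete, here is a minimal toy sketch of a single timestep. It is not Greta and does not compute Whittle indices; it substitutes a myopic greedy rule purely for illustration. Every name in it (`effective_actions`, `greedy_select`, the `NONE`/`PARTIAL`/`DIRECT` labels) is hypothetical, the transition probabilities are made up, all arms are assumed to share the same state-independent probabilities, and an edge u -> v is assumed to mean that v derives partial benefit when u receives the resource.

```python
from typing import Dict, List, Set

# Hypothetical labels for the three effective actions an arm can experience.
NONE, PARTIAL, DIRECT = 0, 1, 2


def effective_actions(direct: Set[int], graph: Dict[int, List[int]],
                      n_arms: int) -> List[int]:
    """Map the set of directly served arms to a per-arm effective action.

    Assumption: an edge u -> v means v gets partial benefit when u is served.
    """
    actions = [NONE] * n_arms
    for u in direct:
        for v in graph.get(u, []):
            actions[v] = PARTIAL
    for u in direct:
        actions[u] = DIRECT  # direct receipt overrides any partial benefit
    return actions


def expected_good(actions: List[int], p_good: Dict[int, float]) -> float:
    """Expected number of arms making a favorable transition this step."""
    return sum(p_good[a] for a in actions)


def greedy_select(graph: Dict[int, List[int]], n_arms: int, budget: int,
                  p_good: Dict[int, float]) -> Set[int]:
    """Myopic stand-in policy: add one arm at a time, up to the budget,
    picking whichever arm most increases the expected one-step benefit."""
    chosen: Set[int] = set()
    for _ in range(budget):
        best_arm, best_val = None, None
        for arm in range(n_arms):
            if arm in chosen:
                continue
            trial = effective_actions(chosen | {arm}, graph, n_arms)
            val = expected_good(trial, p_good)
            if best_val is None or val > best_val:
                best_arm, best_val = arm, val
        if best_arm is None:
            break
        chosen.add(best_arm)
    return chosen


# Illustrative inputs only: four arms, arm 0 has three out-neighbors.
toy_graph = {0: [1, 2, 3], 1: [], 2: [3], 3: []}
toy_p_good = {NONE: 0.2, PARTIAL: 0.5, DIRECT: 0.8}
selected = greedy_select(toy_graph, n_arms=4, budget=1, p_good=toy_p_good)
```

With a budget of one, the sketch favors the arm whose out-neighbors collect the most partial benefit, which is the intuition behind a graph-aware policy. A faithful implementation would track per-arm states over a horizon and rank arms by index values rather than one-step gains.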

Related articles

Wireless Channel Selection with Restless Bandits

Wireless devices are often able to communicate on several alternative channels; for example, cellular phones may use several frequency bands and are equipped with base-station communication capability together with WiFi and Bluetooth communication. Automatic decision support systems in such devices need to decide which channels to use at any given time so as to maximize the long-run average thr...

Opportunistic Scheduling as Restless Bandits

In this paper we consider energy efficient scheduling in a multiuser setting where each user has a finite sized queue and there is a cost associated with holding packets (jobs) in each queue (modeling the delay constraints). The packets of each user need to be sent over a common channel. The channel qualities seen by the users are time-varying and differ across users; also, the cost incurred, i...

Modeling Human Performance in Restless Bandits with Particle Filters

Bandit problems provide an interesting and widely-used setting for the study of sequential decision-making. In their most basic form, bandit problems require people to choose repeatedly between a small number of alternatives, each of which has an unknown rate of providing reward. We investigate restless bandit problems, where the distributions of reward rates for the alternatives change over ti...

On an Index Policy for Restless Bandits

We investigate the optimal allocation of effort to a collection of n projects. The projects are 'restless' in that the state of a project evolves in time, whether or not it is allocated effort. The evolution of the state of each project follows a Markov rule, but transitions and rewards depend on whether or not the project receives effort. The objective is to maximize the expected time-average ...

Restless Bandits, Partial Conservation Laws and Indexability

We show that if performance measures in a general stochastic scheduling problem satisfy partial conservation laws (PCL), which extend the generalized conservation laws (GCL) introduced by Bertsimas and Niño-Mora (1996), then the problem is solved optimally by a priority-index policy under a range of admissible linear performance objectives, with both this range and the optimal indices being det...

Journal

Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence

Year: 2023

ISSN: 2159-5399, 2374-3468

DOI: https://doi.org/10.1609/aaai.v37i10.26415